34 research outputs found

    Energy effective issue logic

    The issue logic of a dynamically scheduled superscalar processor is a complex mechanism that starts the execution of multiple instructions every cycle. Because of its complexity, it accounts for a significant percentage of the energy consumed by a microprocessor. The energy consumption of the issue logic depends on several architectural parameters, the instruction issue queue size being one of the most important. In this paper we present a technique to reduce the energy consumption of the issue logic of a high-performance superscalar processor. The proposed technique is based on the observation that conventional issue logic wastes a significant amount of energy on useless activity. In particular, waking up empty entries and operands that are already ready is an important source of energy waste. In addition, we propose a mechanism to dynamically reduce the effective size of the instruction queue. We show that on average the effective instruction queue size can be reduced by 26% with minimal impact on performance. This reduction, together with the energy saved on empty and ready entries, results in about a 90.7% reduction in the energy consumed by the wake-up logic, which represents 14.9% of the total energy of the assumed processor. Peer reviewed. Postprint (published version).

    Mechanisms for cooperative shared memory

    This paper explores the complexity of implementing directory protocols by examining their mechanisms: primitive operations on directories, caches, and network interfaces. We compare the following protocols: Dir1B, Dir4B, Dir4NB, DirnNB, Dir1SW and an improved version of Dir1SW (Dir1SW+). The comparison shows that the mechanisms and mechanism sequencing of Dir1SW and Dir1SW+ are simpler than those of the other protocols. We also compare protocol performance by running eight benchmarks on 32-processor systems. Simulations show that Dir1SW+'s performance is comparable to that of more complex directory protocols. The significant disparity in hardware complexity and the small difference in performance argue that Dir1SW+ may be a more effective use of resources. The small performance difference is attributable to two factors: the low degree of sharing in the benchmarks and Check-In/Check-Out (CICO) directives.

    Exploiting Idle Floating-Point Resources For Integer Execution

    In conventional superscalar microarchitectures with partitioned integer and floating-point resources, all floating-point resources are idle during execution of integer programs. Palacharla and Smith [26] addressed this drawback and proposed that the floating-point subsystem be augmented to support integer operations. The hardware changes required are expected to be fairly minimal. To exploit thes

    Evaluating stream buffers as a secondary cache replacement

    Nonlinear Joint Transform Correlator with a Multiple Quantum Well Photorefractive Device

    A nonlinear joint transform correlator incorporating a GaAs/GaAlAs MQW device is described and demonstrated for face recognition. The correlator emphasizes high-frequency components in the input and shows good tolerance to variation in facial expression. NRC publication: Ye

    Rescue
